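While variance reduction methods have been highly successful in solving large-scale optimization problems, many of them suffer from accumulated errors and therefore periodically require a full gradient computation. In this paper, we present a single-loop algorithm (Single-Loop Method for Gradient Estimator) for finite-sum nonconvex optimization, which does not require periodic refreshing of the gradient estimator yet achieves nearly optimal gradient complexity. Unlike existing methods, SLEDGE has the advantage of versatility: (i) second-order optimality, (ii) exponential convergence in the PL region, and (iii) smaller complexity under lower data heterogeneity. We build an efficient federated learning algorithm by exploiting these favorable properties. We show the first- and second-order optimality of the output and also provide an analysis under the PL condition. When the local budget is sufficiently large and the clients are less (Hessian-)heterogeneous, the algorithm requires fewer communication rounds than existing methods such as FedAvg, SCAFFOLD, and Mime. The advantage of our method is verified in numerical experiments.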
Translated by Google Translate
Transparency of machine learning models used for decision support in various industries has become essential for ensuring their ethical use. To that end, feature attribution methods such as SHAP (SHapley Additive exPlanations) are widely used to explain the predictions of black-box machine learning models to customers and developers. However, a parallel trend has been to train machine learning models in collaboration with other data holders without accessing their data. Such models, trained over horizontally or vertically partitioned data, present a challenge for explainable AI because the explaining party may have a biased view of background data or a partial view of the feature space. As a result, explanations obtained from different participants of distributed machine learning might not be consistent with one another, undermining trust in the product. This paper presents an Explainable Data Collaboration Framework based on a model-agnostic additive feature attribution algorithm (KernelSHAP) and the Data Collaboration method of privacy-preserving distributed machine learning. In particular, we present three algorithms for different scenarios of explainability in Data Collaboration and verify their consistency with experiments on open-access datasets. Our results demonstrate a significant (by at least a factor of 1.75) decrease in feature attribution discrepancies among the users of distributed machine learning.
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Removing reverb from reverberant music is a necessary technique to clean up audio for downstream music manipulations. Reverberation in music falls into two categories: natural reverb and artificial reverb. Artificial reverb has a wider diversity than natural reverb due to its various parameter setups and reverberation types. However, recent supervised dereverberation methods may fail because they rely on sufficiently diverse and numerous pairs of reverberant observations and retrieved data for training in order to be generalizable to unseen observations during inference. To resolve these problems, we propose an unsupervised method that can remove a general kind of artificial reverb for music without requiring pairs of data for training. The proposed method is based on diffusion models: it initializes the unknown reverberation operator with a conventional signal processing technique and simultaneously refines the estimate with the help of diffusion models. We show through objective and perceptual evaluations that our method outperforms the current leading vocal dereverberation benchmarks.
Score-based generative models learn a family of noise-conditional score functions corresponding to the data density perturbed with increasingly large amounts of noise. These perturbed data densities are tied together by the Fokker-Planck equation (FPE), a PDE governing the spatial-temporal evolution of a density undergoing a diffusion process. In this work, we derive a corresponding equation characterizing the noise-conditional scores of the perturbed data densities (i.e., their gradients), termed the score FPE. Surprisingly, despite impressive empirical performance, we observe that scores learned via denoising score matching (DSM) do not satisfy the underlying score FPE. We mathematically analyze three implications of satisfying the score FPE and a potential explanation for why the score FPE is not satisfied in practice. Finally, we propose to regularize the DSM objective to enforce satisfaction of the score FPE, and show its effectiveness on synthetic data and MNIST.
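We have developed a bipedal robot with passive dynamic walking mechanisms. This study proposes a compass model with a wobbling mass that is connected to the upper body and oscillates in the horizontal direction, in order to clarify the influence of the horizontal dynamics of the upper body on bipedal walking. The limit cycles of the model were searched for numerically, and their stability and energy efficiency were investigated. Several different limit cycles were obtained depending on the spring constant that supports the wobbling mass. Specific types of solutions decreased the stability while reducing the risk of accidental falls and improving energy efficiency. The obtained results were attributed to the wobbling mass moving in the direction opposite to the upper body, thereby preventing large changes in acceleration and deceleration while walking. The relationship between the locomotion of the proposed model and the gaits of the actual bipedal robot and of humans was investigated.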
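The expectation-maximization (EM) algorithm is a simple meta-algorithm that has been used for many years as a method for statistical inference when measurements are missing from the observed data or when the data consist of observable and unobservable parts. Its general properties are well studied, and there are countless ways to apply it to individual problems. In this paper, we introduce the $em$ algorithm, an information-geometric formulation of the EM algorithm, together with its extensions and applications to various problems. Specifically, we will see that it is possible to formulate, from a geometric perspective, an outlier-robust inference algorithm, an algorithm for computing channel capacity, parameter estimation methods on the probability simplex, particular multivariate analysis methods such as principal component analysis in a space of probability models and modal regression, matrix factorization, and the learning of generative models, which have recently attracted attention in deep learning.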
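Multi-source data fusion, in which multiple data sources are jointly analyzed to obtain improved information, has attracted considerable research attention. For the datasets of multiple medical institutions, data confidentiality and cross-institutional communication are critical. In such cases, data collaboration (DC) analysis, which shares dimensionality-reduced intermediate representations without iterative cross-institutional communication, may be appropriate. When analyzing data that include personal information, the identifiability of the shared data is essential. In this study, the identifiability of DC analysis is investigated. The results show that the shared intermediate representations are readily identifiable with the original data in supervised learning. This study then proposes a non-readily identifiable DC analysis that shares only non-readily identifiable data for multiple medical datasets including personal information. The proposed method addresses the identifiability problem based on random sample permutation, the concept of interpretable DC analysis, and the use of functions that cannot be reconstructed. In numerical experiments on medical datasets, the proposed method exhibits non-ready identifiability while maintaining the high recognition performance of conventional DC analysis. For a hospital dataset, the proposed method shows a nine-percentage-point improvement in recognition performance over local analysis that uses only the local dataset.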
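In this paper, we propose an extension of the Attention Branch Network (ABN) that uses instance segmentation to generate sharper attention maps for action recognition. Visual explanation methods such as Grad-CAM usually produce blurry maps that are not intuitive for humans to understand, particularly when recognizing the actions of people in videos. Our proposed extension of ABN addresses this issue by introducing a new mask loss that makes the generated attention maps close to the instance segmentation results. In addition, a PC loss and multiple attention maps are introduced to enhance the sharpness of the maps and improve classification performance. Experimental results on UCF101 and SSV2 show that the maps generated by the proposed method are clearer, both qualitatively and quantitatively, than those of the original ABN.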
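Recently, many works have explored sim-to-real transferable visual model predictive control (MPC). However, such works are limited to one-shot transfer, in which real-world data must be collected once to perform the sim-to-real transfer; this still demands significant human effort when transferring models learned in simulation to new domains in the real world. To alleviate this problem, we first propose a novel model-learning framework called the Kalman Randomized-to-Canonical Model (KRC model). This framework is capable of extracting task-relevant intrinsic features and their dynamics from randomized images. We then propose Kalman Randomized-to-Canonical Model Predictive Control (KRC-MPC), a zero-shot sim-to-real transferable visual MPC that uses the KRC model. The effectiveness of our method is evaluated through a task performed by a robot hand in both simulation and the real world, and a block mating task in simulation. The experimental results show that KRC-MPC can be applied to various real domains and tasks in a zero-shot manner.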